The Spreadsheet Detective logo

Technical Supplement

Version 5.8, January 1999, Patent Pending

Contents

Note that the primary documentation for the Spreadsheet Detective is the DetectEg.xls spreadsheet which contains a series of examples that illustrate each aspect of the Spreadsheet Detective, together with a description of what each example means. This document provides supplementary information, and assumes that the reader has already read the example workbook  The example workbook has been designed to be printed if desired.  (Ordering information can be found in the Order page.)

Configuration Options

The Options entry in the The Spreadsheet Detective Menu enables The Spreadsheet Detective to be configured. By default all check boxes are unchecked, and other default values within "[ ]"s in the dialog box. The options are as follows:-

Configuration Dialog, MainOptions

Show formulas without Precedents (ie. that are just constants, eg. = 1 + 2) Normally formulas that do not reference other cells are not shown as schema annotations. These formulas may be used to calculate constants, eg. "= 1 + 2 / 3". This option controls both normal annotations and shading. However, for performance reasons the "#" notation is never used for references to cells that have a formula even if they do not have precedents.

show Solid dots as well as circles (Neater but slow on large worksheets) The solid red dots are not normally used because this can be very slow on large spreadsheets. The open circles are always produced to show missing formulas, so the annotations are never unambiguous. Solid dots are also still used in places that could cause confusion such as empty cells that contain formulas. However, the solid dots make the meanings of the annotations clearer and it is recommended that this option be chosen when presenting a finished spreadsheet to others.

Do not Highlight references to non numeric cells (Red) If a range in a formula contains a reference to a cell that does not contain a number or date it is highlighted because this often indicates an error. The first and last cell in a multi-cell range may be empty without being highlighted.

Highlight Forward references to cells below or to the right (Red) These may be highlighted because they may be considered bad style. A Forward reference is a reference to a cell in the same worksheet that has neither the row nor column less than the address of the referencing cell.

Highlight constant Numbers in formulas (Red) It is generally bad practice to include constants other than 0, 1 or 2 in formulas. However these are not highlighted by default because they are easy to find just by reading the formula.

Highlight Year 2000 Date Functions (Bold Pink) This option highlights date functions or references to cells that have been formatted as dates. These cells should be checked for Year 2000 conformity. Like all such highlighting, a summary is inserted at the end of the worksheet.

Highlight Circular references (bold red) (Can be slow) If a range in a formula contains a circular reference it is highlighted. This can be computationally expensive on large spreadsheets and so should only be used when help is required to resolve circular references. Care is taken to only highlight the cells that are actually in the cycle. It is possible to have very obscure circular references that do not include any schema and so will obviously not be highlighted. This option is not available in Excel Version 5.

Do not show cell Dependents as a diagonal line (Green/Brown) The cross reference diagonal lines can be suppressed if required.

Attempt to Qualify Duplicate AutoNames (Requires an extra PreProcess pass) This option causes the Spreadsheet Detective to attempt to qualify ambiguous AutoNames. In order to do this requires an extra pass to determine which AutoNames are ambiguous. This is not enabled by default because it is difficult to determine which labels should be used as Qualifiers on unstructured spreadsheets, and some additions of "."s and ".."s may be required.

Do not include Schemas, just show Excel Name Definitions This option completely suppresses the schemas, but still shows other details such as Excel Named Range definitions.

Layout Options

Suppress highlighting Three Dimensional schemas (Dashed Box) By default schemas are compared with the corresponding formula in the immediately preceding worksheet, and highlighted with a dashed box if they are the same.

Suppress highlighting unProtected formulas (Pink) The highlighting of formulas that are not locked can be suppressed with this option.

Suppress Range lines (Green). The green lines that indicate the scope of ranges used in formulas can also be suppressed.

Show formula schemas as Empty boxes This minimizes the space required for the annotations by reducing schemas to a small box, which just shows which cells contain related formulas. The schemas are still parsed so features such as dependent cell lines are not suppressed. The shading facility provides similar information in a more compact manner but the empty boxes may look better on a black and white printer.

Suppress Inserting "#" after references to constant inputs The "#" character that is included after references to input cells without formulas can be suppressed.

Do not expand reused precedents multiple times in precedent report Normally, if cell A references B and C, and both C and D reference E, and E references F, then E and F would be included twice in the precedent report. This option causes the second listing of E and F to be suppressed. It is not enabled by default because it can cause confusion in outlined reports.

Suppress Widening of columns to stop lines going through numbers Columns are normally widened slightly if this would prevent a number being obscured by a vertical line. In practice this happens fairly rarely because most columns have a little space to their right. However, this option suppresses this behaviour if space is at a premium.

Suppress Holistic layout when determining column sizes etc. The Spreadsheet Detective normally takes the number of other schemas in a row or column into account when determining aspect ratios and maximum formula sizes. This option can be used to suppress this action.

Do not remove window Freeze / Split in Excel 7 / 95 When the Spreadsheet Detective annotates a spreadsheet it removes any window splits and frozen panes for Excel 7/95 because this is very slow (this is not a problem in Excel 97).

Set Print Gridlines, Headings and Margins from prototype (Can be slow in Excel)  Gridlines and headings are also enabled for both display and printing. The original settings are remembered in an invisible textbox and are restored by default when the spreadsheet is unannotated. This option enables them to be not reset if the worksheet is unannotated, or to not be changed in the first place. It is best not to change these settings while a spreadsheet is annotated because the changes might be lost next time the spreadsheet is unannotated.  These values are controlled by the prototype spreadsheet.

[4] Aspect Ratio factor of schema boxes Increasing this number will make columns containing multi-line schemas wider and so reduce the vertical space required to display them. The actual aspect ratio for each cell also takes into account the existing size of the cell and the number of other formulas in the same row or column.

[40] Maximum schema size per column before moving to the end Schemas that have more than about this number of characters divided by the number of columns that are available for the schema are annotated separately at the end of the spreadsheet. The actual decision to do this also takes into account the size of the existing row or column and the number of other formulas that are in the same row or column. The new layout algorithms of V3.1 greatly reduce the need to move large formulas because it allows them to span multiple columns.

[7] Minimum Font size when Zoomed The minimum size of the fonts used. This size is absolute and is designed to ensure that the font is always readable even though the window was zoomed when the annotations were being made. Note that it is the window view zoom that is used, not the printing zoom, so that it is best to print an annotated spreadsheet with a similar zoom factor to which it is displayed subject to the accuracy of the printer. Use the "Show formula schemas as empty boxes" option for a very condensed view without the schema text being shown. The default min font size is 8 for Macintosh.

[5] Maximum number of empty cells to skip before a new schema (or /2 for non-empty)  The layout algorithms will not consider that two formulas have the same schema if more than this many empty cells or half this number of non-empty cells appear between them even if they are equivalent. This can be aesthetically more pleasing.

[10] Minimum columns before a schema is repeated at the other end If a formula has been copied more than this number of columns, the schema is repeated at the right hand side of the line. This is convenient when the window is scrolled right. Repeated Schemas are not counted as Schemas in the statistics, do not produce diagonal dependents lines, and are not separately checked for circular references.

[7] Maximum depth factor for Formula Precedents Report The Precedent Report will only list formulas to this depth regardless of the outline level. Further, only 2 ^ (MaxDepth - RecurDepth + 1) formulas will be listed in each level.

Names & Ranges Options

Suppress Hidden Name definitions, eg. Solver Names This option prevents hidden Name definitions to be shown such as those used by the Solver to store its parameters. Hidden Name definitions are shown within "()"s.

Include Hidden Names when expanding A1 refs in Schemas Normally hidden Names are not used to describe A1 references in schemas. The special Excel Names "Print_Area" and "Print_Titles" are never used.

Ignore Excel Name completely, ie. for both definition and reference (Brown) All processing of Excel Names is suppressed. AutoNames are produced instead. Rarely used.

Do not append "~Name" to A1 references in schemas for Excel Named Ranges This suppresses the expansion of A1 references with any Excel Names. AutoNames are created instead. This option is useful if the Excel Names have many errors in their definitions and so are often misleading.

Add "~A1" to Names in schemas If a user has used Excel Names rather than A1 references in a formula, The Spreadsheet Detective can append an "~=A1~" annotation to the schema to show the NameÆs definition. This is normally only done if the NameÆs definition is not displayed on the same sheet or if it defines a multiple area range. It would not be displayed on the same sheet if it refers to a range on a different sheet or because it contain a formula. These options can be used to force "~A1~" to be appended to all Names regardless whether they were displayed. They can also be used to prevent any "~A1~"s being appended.

Suppress AutoNames Completely This option suppresses both the creation and insertion of AutoNames.

Show AutoName definition annotations (Green) This causes The Spreadsheet Detective to include annotations that show the definitions of AutoNames in a similar way that Excel Name definitions are shown. AutoName definitions are not normally shown because AutoNames always appear with the original A1 references in schemas. However, showing the definitions clarifies the way the Spreadsheet Detective derives them.

Suppress using both `Row`Column (Auto)Names and only use the best one instead This option can be used to force the Spreadsheet Detective to choose either a row or column AutoName when describing an A1 reference. Complex heuristics are used, but experience suggests that using both is best when it is not otherwise clear which should be used.

Use simplified tests to determine whether a cell contains a potential label Normally the formatted value of cells is used to determine whether they contain labels or just numbers. For example, a cell formatted <"Proposition "0> would be a label, not number. Numbers such as 1989 are also considered candidate labels. These heuristics work well, but this option may be needed to disable them in unusual or non-English spreadsheets.

ReCalculate Names each time Describe 1 Cell is called (slower but more accurate) Normally Describe 1 Cell remembers any Names and Autonames that were derived previously by it or (re)Annotate Sheet(s) which avoids having to determine them again. However, this can be inaccurate if Excel Names or cell labels change between invocations. This option forces Describe 1 Cell to always determine Names and AutoNames from scratch.

[12] Max AutoName Size This parameter controls the size of the AutoName abbreviations.

[18] Max symbol size before abbreviation Symbols in formulas that contain more than this number of characters are abbreviated in a table at the end of the spreadsheet. This is most useful for abbreviating file names.

Special Options

Force Ctrl+Alt+F9 re-calculation to avoid Exel 97's recalculation problems There have been a number of bugs reported in which Excel does not recalculate cell values correctly. The Spreadsheet Detective normally forces a recalculation whenever a work sheet is annotated to resolve this if and only if Tools | Options | Calculate | Automatic has been set. However, this option can be used to override this behaviour.

Safe Mode -- Suppress features that might cause Exel to crash. Excel can perform illegal operations when processing particularly large and complex spreadsheets. In the unlikely event that this happens this option can be selected to suppress features and optimizations that Excel might be having difficulty with. These include:-

Be very careful when adding the service release patches to Excel 97. There is a complex web of procedures that are required to do this properly. The only safe solution is to insist on receiving a new CD from Microsoft that contains a complete copy of Office which includes the latest patches.  Do not accept the CD that just contains the patches.

Produce DetectL.Log diagnostic file This file can be created by the Spreadsheet Detective in the same directory as the spreadsheet, and records key actions that can be useful should Excel signal internal errors.

Produce trailing Formula Map (Traditional but not that useful) This option will cause a simple formula map can be appended to the spreadsheet. The map represents each cell in the spreadsheet with a single character that shows which cells have formulas and schemas. While these were popular in first generation auditing tools, annotating the cells directly with the Spreadsheet Detective makes the map less useful.

Suppress Annotating Charts Annotating charts makes it clear which ranges of cells they are actually describing. This is important because invalid charts produce invalid decisions.

Add Excel OLE Link Tables (Obscure) This option describes embedded OLE objects and shows which files are being referenced.  It also includes a list of any Named Ranges that contain references to other workbooks.  Such Names are not recommended -- it is best to have a separate worksheet that contains all the external references.  Note that any references to other workbooks in cell formulas are always documented regardless of this option.

Do not use local settings for formulas, use US English (most tested version)  The Detective now shows formulas in the local language.  However, this option can be used to show them in the default US English format.  This affects things like whether the decimal separator is a "." or a "," for European languages.  Far east languages etc. are not supported.

Update Prototypes The colour, size and shape of the annotations may be changed using this option. Before a spreadsheet is annotated, the Spreadsheet Detective tries to find an open workbook that contains prototype sheets. If they are found, they is used to provide prototypes of the objects that appear as annotations. This option can be used to generate such prototype workbooks which contain further documentation.  The prototypes may then be edited but this should not normally be necessary.  It is also possible to change the default print margins etc. by changing a prototype.

Edit Licence. The Detecti.ini file contains the name of the organization that the copy of the Spreadsheet Detective has been licenced to and a licence key which is a number issued by Southern Cross Software Queensland that is required to run the Spreadsheet Detective with the given organization name and expiry date. This button provides a dialog box that enables this information to be entered if necessary.  However, a Detecti.ini file is usually supplied with the Detective so this option should rarely be necessary.

Other Utilities

The following additional utilities are available on the SDetective menu:-

Comparison with Existing Excel Facilities

Several studies have shown that users are often overconfident in the correctness of spreadsheet models, and that production spreadsheets of even moderate complexity often contain significant errors that have lead to expensive mistakes and litigation. The Spreadsheet Detective is a Microsoft Excel Add-In that addresses this problem by documenting existing spreadsheets so as to make them easier to develop, understand and validate.  This section compares the Detective to in built facilities within Excel for reviewing spreadsheets.

Viewing Formulas

Excel spreadsheets are mainly developed and checked using the formula bar at the top of the screen which shows the text of individual formulas. However, even a tiny spreadsheet of 10 columns and 20 rows could contain 200 formulas, so validating spreadsheets using the formula bar is tedious and error prone

To assist with this Excel provides the Tools | Options | View | Formulas option to display formulas in each cell instead of their values. This enables all the formulas on a spreadsheet to be seen at once. However, this format is very cumbersome with formulas that do not fit within the cell being silently truncated. In practice many formulas do not fit and so the formulas bar still needs to be used. The cells are widened in an attempt to show at least part of the formulas, but this causes the sheet to become too large to effectively view or print.

The underlying problem is that by showing each formula in the sheet, most of the information that Excel displays is redundant. Most of the 200 formulas in a 10 by 20 spreadsheet would normally be copies of much fewer formula schemas. By just showing the schemas, the Spreadsheet Detective provides a succinct view of both the formulas and their calculated values and so makes it much easier to understand the assumptions behind a model.

Excel also provides an "Audit" menu. However, all this does is show the precedent and dependent cells for a given formula. While this can be useful for fixing certain types of errors after they have been detected, the menu is of minimal use when auditing a spreadsheet as a whole. The Spreadsheet Detective's Precedent Report also makes it much easier to track precedents of cells between different sheets.

A more useful auditing tool is the Edit | Go To | Special | Row Different / Column Different functions. These allow the user to select a range of cells and then manually move to each cell that has a different formula to the one in the previous row or column. However, this process is quite tedious because each range of similar cells needs to be selected and validated separately, so the option is normally only used on parts of a model after an error has already been detected. Further, these functions cannot highlight subtle problems with the range of array formulas. The graphical annotations used by the Spreadsheet Detective automatically make inconsistent formulas obvious.

Excel 97 can now colour the ranges of cells referred to by a formula as it is being edited provided they are on the same worksheet. While this may be of assistance to novice users, it does little to help ensure the integrity of an entire spreadsheet. Excel 97 has also improved its support for the correction of circular references by placing blue arrows between the cells that are involved. The Spreadsheet Detective goes further by showing which part of a formula is involved, and does not require that one cycle be fixed before another is shown. Excel does not provide any assistance for models that that deliberately use some circular references.

Naming Ranges

Excel enables ranges of cells to be given a Name and then be referenced by a name in a formula. For example, a formula might read "= Cost + Margin" rather than "= C12 + C13 " which makes it much easier to understand the formulaÆs meaning. Excel provides a facility to create Named ranges based on text labels in a spreadsheet. Names can also be automatically be "Applied" to a range of formulas which replaces each A1 reference in a formula with a Name if that is possible.

However, while Names are easy to create it can be quite difficult to check that the Names refer to the correct cells. Thus while a formula like " Cost + Margin" is easy to read, there is no guarantee that "Cost" and "Margin" refer to the correct cells. Names can be recreated should labels change, but the previous Names are not removed unless the new labels would cause them to be overridden. More importantly, the formulas that contained the old Names are not automatically updated and so can become quite misleading. Excel also has a complex notion of local and global names which can cause other subtle errors as illustrated in "[DetectEg.xls]Names" and "[DetectEg.xls]Actual". Formulas can also become unwieldy if Names have been automatically applied using ExcelÆs Insert | Names | Apply function, for example "=Sum(B2:C2)" could be converted into "=SUM((Name5 Name1):(Name5 Name2))".

For all these reason many of even the most experienced spreadsheet users avoid using Named ranges in practice. AutoNames make the use of Excel Names largely redundant, but if Names are used then The Spreadsheet Detective greatly reduces the problems with them by clearly showing which ranges they refer to and highlighting Names that are different from their text labels. The Spreadsheet Detective also avoids the problems of applying Names to formulas by only applying them to the Schemas, and not the original formulas. This makes it easy to update the schemas should the Names change. By showing both the original A1 reference and the Name, The Spreadsheet Detective also makes it easy to verify that the correct cells are being referenced.

"Natural Language" Formulas

Microsoft has attempted to address the problems with Names by introducing "Natural Language" formulas into Excel 97. These enable the user to manually enter a label as part of a formula in which case Excel will look for a cell with a compatible label and assume that that is the cell that is being referred to. Should the cell label change, the formula will automatically be updated.

Unfortunately, while reference natural language labels are easy to create, it can be very difficult to determine what cells they actually refer to. This can make even simple spreadsheets very difficult to audit if ExcelÆs assumptions are incorrect.

For example, the very simple spreadsheet in Figure 1 uses natural language references to model the Income and Expenses for the Rubber and Figures departments. (This spreadsheet has been included with the Spreadsheet Detective and is called "NatLanEg.xls".)

The Rubber department looks fine, but a close examination of the Figures department shows that its net profit for 1997 is 20 instead of û30. Further, the Sum to the right of the Figures Dept is also not correct. However, the formulas shown in the formula view in Figure 2 are exactly the same for both the Rubber and Figures department. This is a serious problem because it means that it is not possible to audit the spreadsheet by examining the text of the formulas within it. The only clue to the problems is that the Total has the rather strange formula "= Net + Net", but this formula is actually correct!

A second problem is that while natural language formulas can be easily copied, ExcelÆs "Go To Special" command is much less useful for checking the consistency of natural language formulas because it only compares the text of the formulas. This means that each formula in the Average or Sum columns would be considered to be different by the "Go To Special" command when in fact they should just be a copy of the first one. Thus while natural language formulas are superficially easy to use, they can make it extremely difficult to verify a spreadsheetÆs correctness. It is also worth noting that although Figure 2 contains 18 formulas, it only contains 4 schemas, so The Spreadsheet Detective can produce a much more succinct description of this model.

For all these reasons it is strongly recommended that natural language references never be used in spreadsheets for which correctness is important. In any case it is tedious having to type in the full natural language reference rather than a simple A1 reference, and natural language references cannot span multiple worksheets which is the most important case for models of even moderate complexity. "[DetectEg.xls]NatLang" shows how the same spreadsheet would be annotated by The Spreadsheet Detective which makes the errors clear.

Normal view of Rubbery Figures spreadsheet

Figure 1 Natural Language Figures
 

Excel formula view of Rubbery Figures

Figure 2 Natural Language Formula View


 

Comparing Sheets

Excel 97 has introduced the ability to share workbooks between multiple users simultaneously. As a by product of this feature it can also track changes that were made to cells in a spreadsheet. The tracking is quite comprehensive, and includes inserting, deleting and moving rows.

However, the ability to track changes is lost if the workbook is ever changed to normal, non-share mode. If the workbook is subsequently placed back into share mode, changes will continue to be tracked. However, any changes that were made before the most recent time that the spreadsheet was placed in share mode are simply lost and cannot be recovered except by manually comparing each cell.

In order to prevent this, it is possible to include a password that prevents people from being able to turn off share mode. However, it is impossible to change most aspects of a spreadsheet other than basic cell values and formulas in share mode. In particular, it is not possible to:-

Having more than one user update the same spreadsheet at the same time can also be confusing to the users, but it is not possible to track changes without enabling concurrent sharing. Thus there are may reasons why users may wish to suspend share mode. However, once this is done even for an instant any changes that had been made can no longer be tracked.

The Spreadsheet Detective compares two arbitrary spreadsheets without any need to have tracked the changes to them. This is safer and easier than relying on the sharing mechanism to be properly activated at all times. It also makes it possible to compare a spreadsheet to another spreadsheet which is not a direct ancestor.

Year 2000 Compliance

The Spreadsheet Detective is fully Year 2000 compliant.

Dates within the underlying Excel spreadsheet tool are stored as floating point numbers that enable dates to be correctly stored. Excel 95 can store dates up to 2078, while Excel 97 can store dates up to 9999 (http://www.microsoft.com/office/office97/documents/y2k/default.htm). If two digit dates are entered Excel makes sensible assumptions about which century to use, with numbers 00..29 being assumed to be in the 21st century for Excel 97 (http://support.microsoft.com/support/kb/articles/q164/4/06.asp). Full details can be found in http://www.microsoft.com/ithome/topics/year2k/product/excel97.htm.

The Spreadsheet Detective displays dates in statistics text boxes to indicate when it was last used to document a worksheet. These dates are always displayed with four digits. Any internal date processing is performed using Excel's standard floating point representation, and is therefor compliant.

The Spreadsheet Detective also provides tools for assisting users to ensure that their spreadsheets are also Year 2000 compliant, as described above. These include

The Spreadsheet Detective is an important tool because it addresses the serious problem of undetected spreadsheet errors that can invalidate strategic decisions based upon erroneous financial models. However, the tool is not mission critical for day to day operations. The tool also performs minimal date calculations, so it would be most unlikely to have undetected Year 2000 issues. Thus extensive customer Year 2000 testing is not recommended.

Version History

Version 5.8

Minor enhancements.  The print margins of annotated sheets are now minimized which became feasible by use of an old Excel 4 macro.  January 1999.

Version 5.7

This version improves international support.  It can show formulas in local language with local conventions for decimals etc.  November 1998.  The Precedent Workbook Report was also added.

Version 5.5

The Shareware version was released in October 1998. Minor enhancements include:-

Version 5.3

Very minor changes for Macintosh path names. (The Macintosh is not officially supported due to a lack of demand. However, preliminary testing suggests that the software works well on the Mac.)

Version 5.1

Released in August 1998, this version builds on 4.1. The main improvements are:-

Version 4.1

Released in May 1998, this version contains new features to compare and report spreadsheets and significantly improved AutoNames. The enhancements are:-

Version 3.1

This version was the first production version and was released on 1 November 1997. Enhancements include:-

Disclaimer & Contact Information

While Southern Cross Software Queensland (SCSQ) has taken care to produce a professional product SCSQ obviously does not accept any liability for the consequences of using the tool. The software is supplied "as is" and no warranties are made as to the toolÆs merchantability or fitness for any particular purpose. In no event shall SCSQ or anyone involved in the creation, production or delivery of the software be liable for any damages whatsoever arising out of the use of this software. The entire risk arising out of the use or performance of the software remains with the purchaser and user.

Other licence details are included in the Order page.

Southern Cross Software Queensland
Acn 079 368 200

EMail: Detective@uq.net.au (Preferred)
http:www.uq.net.au/detective
24 Railway St,
Woolloongabba, Qld 4102
Phone +61 414 991 474